
Splitting source files between CUDA and CPU code. #172

Merged: 1 commit merged into BVLC:dev on Feb 27, 2014

Conversation

erictzeng (Contributor)

Addresses #152. Tests pass.

@Yangqing, I modeled this after some of the layer files that were already split between .cpp and .cu, but I'd appreciate a sanity check just to ensure that this way of splitting things isn't going to cause much pain and misery in the future. :)
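For reference, a minimal sketch of the .cpp/.cu split pattern, using a hypothetical ScaleOp with simplified signatures rather than the real Caffe layer interface (three files shown in one block for brevity):

```cpp
// Illustrative .cpp/.cu split (hypothetical ScaleOp, not Caffe's actual layer API).

// scale_op.hpp -- shared declaration, included by both translation units.
#include <cstddef>

template <typename Dtype>
class ScaleOp {
 public:
  explicit ScaleOp(Dtype alpha) : alpha_(alpha) {}
  void Forward_cpu(const Dtype* in, Dtype* out, std::size_t n) const;  // scale_op.cpp
  void Forward_gpu(const Dtype* in, Dtype* out, std::size_t n) const;  // scale_op.cu
 private:
  Dtype alpha_;
};

// scale_op.cpp -- CPU path, built by the host compiler (g++/clang++).
template <typename Dtype>
void ScaleOp<Dtype>::Forward_cpu(const Dtype* in, Dtype* out, std::size_t n) const {
  for (std::size_t i = 0; i < n; ++i) out[i] = alpha_ * in[i];
}
// Explicit class instantiation; only members defined in this file are instantiated.
template class ScaleOp<float>;
template class ScaleOp<double>;

// scale_op.cu -- GPU path, built by nvcc; only this file needs the CUDA toolchain.
template <typename Dtype>
__global__ void ScaleKernel(const Dtype* in, Dtype* out, std::size_t n, Dtype alpha) {
  std::size_t i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) out[i] = alpha * in[i];
}

template <typename Dtype>
void ScaleOp<Dtype>::Forward_gpu(const Dtype* in, Dtype* out, std::size_t n) const {
  const int threads = 256;
  const int blocks = static_cast<int>((n + threads - 1) / threads);
  ScaleKernel<Dtype><<<blocks, threads>>>(in, out, n, alpha_);
}
// Explicitly instantiate only the GPU member functions in the .cu file.
template void ScaleOp<float>::Forward_gpu(const float*, float*, std::size_t) const;
template void ScaleOp<double>::Forward_gpu(const double*, double*, std::size_t) const;
```

The idea is that only the .cu file needs nvcc; the .cpp builds with the ordinary host compiler, and each file instantiates only the templates it defines.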

sguada (Contributor) commented Feb 27, 2014

@erictzeng congrats, it compiles, builds, and passes the tests on OS X 10.7.

sguada (Contributor) commented Feb 27, 2014

@erictzeng Something weird happened: the second time I ran it, the multinomial_logistic_loss_layer test failed. Could this be a problem with the test itself, @Yangqing?

$ ./build/src/caffe/test/test_multinomial_logistic_loss_layer.testbin 
Cuda number of devices: 1
Current device id: 0
[==========] Running 2 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 1 test from MultinomialLogisticLossLayerTest/0, where TypeParam = float
[ RUN      ] MultinomialLogisticLossLayerTest/0.TestGradientCPU
[       OK ] MultinomialLogisticLossLayerTest/0.TestGradientCPU (284 ms)
[----------] 1 test from MultinomialLogisticLossLayerTest/0 (284 ms total)

[----------] 1 test from MultinomialLogisticLossLayerTest/1, where TypeParam = double
[ RUN      ] MultinomialLogisticLossLayerTest/1.TestGradientCPU
[       OK ] MultinomialLogisticLossLayerTest/1.TestGradientCPU (4 ms)
[----------] 1 test from MultinomialLogisticLossLayerTest/1 (4 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 2 test cases ran. (288 ms total)
[  PASSED  ] 2 tests.
$ ./build/src/caffe/test/test_multinomial_logistic_loss_layer.testbin 
Cuda number of devices: 1
Current device id: 0
[==========] Running 2 tests from 2 test cases.
[----------] Global test environment set-up.
[----------] 1 test from MultinomialLogisticLossLayerTest/0, where TypeParam = float
[ RUN      ] MultinomialLogisticLossLayerTest/0.TestGradientCPU
./src/caffe/test/test_gradient_check_util.hpp:124: Failure
The difference between computed_gradient and estimated_gradient is 0.026286721229553223, which exceeds threshold_ * scale, where
computed_gradient evaluates to -1.9747008085250854,
estimated_gradient evaluates to -2.0009875297546387, and
threshold_ * scale evaluates to 0.020009875297546387.
debug: (top_id, top_data_id, blob_id, feat_id)=-1,-1,0,38
[  FAILED  ] MultinomialLogisticLossLayerTest/0.TestGradientCPU, where TypeParam = float (317 ms)
[----------] 1 test from MultinomialLogisticLossLayerTest/0 (317 ms total)

[----------] 1 test from MultinomialLogisticLossLayerTest/1, where TypeParam = double
[ RUN      ] MultinomialLogisticLossLayerTest/1.TestGradientCPU
[       OK ] MultinomialLogisticLossLayerTest/1.TestGradientCPU (3 ms)
[----------] 1 test from MultinomialLogisticLossLayerTest/1 (3 ms total)

[----------] Global test environment tear-down
[==========] 2 tests from 2 test cases ran. (320 ms total)
[  PASSED  ] 1 test.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] MultinomialLogisticLossLayerTest/0.TestGradientCPU, where TypeParam = float

shelhamer (Member)

@sguada it's just a precision error. This test was relaxed in the boost-eigen branch by slightly increasing epsilon in 59584ad2223a27ed5090ee13b2f452f0a5c09eff.

Note that this does not include the MKL/non-MKL merge yet, which is what is broken on OS X. However, we expect the combination of this and #165 to fix #122.
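For what it's worth, here is a minimal sketch of the kind of central-difference check that failure message comes from (a hypothetical CheckGradient helper; the real checker in src/caffe/test/test_gradient_check_util.hpp differs in detail):

```cpp
// Hypothetical finite-difference gradient check, for illustration only.
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <functional>
#include <vector>

bool CheckGradient(std::vector<float>& x,
                   const std::function<float(const std::vector<float>&)>& loss,
                   const std::vector<float>& computed_grad,
                   float eps = 1e-2f, float threshold = 1e-2f) {
  bool ok = true;
  for (std::size_t i = 0; i < x.size(); ++i) {
    const float orig = x[i];
    x[i] = orig + eps; const float plus = loss(x);
    x[i] = orig - eps; const float minus = loss(x);
    x[i] = orig;
    // Central-difference estimate of d loss / d x[i].
    const float estimated = (plus - minus) / (2.0f * eps);
    // Tolerance scaled by the gradient magnitude, as in the failure message
    // above (threshold_ * scale).
    const float scale = std::max(std::fabs(estimated), 1.0f);
    if (std::fabs(computed_grad[i] - estimated) > threshold * scale) {
      std::printf("mismatch at %zu: computed %g vs estimated %g\n",
                  i, computed_grad[i], estimated);
      ok = false;
    }
  }
  return ok;
}
```

With float inputs and an estimated gradient near -2.0, a threshold of 1e-2 scaled by the gradient magnitude gives exactly the 0.02 tolerance shown in the log, so a small amount of single-precision noise is enough to cross it.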

sguada (Contributor) commented Feb 27, 2014

@shelhamer I think 0.02 is quite big for a precision error.

shelhamer (Member)

I'm quoting a previous conversation with @Yangqing, who said it was safe to ignore. Thank you for bringing up the question; we should aim for better test documentation.

mavenlin (Contributor)

Alex also said it was safe to ignore.
https://code.google.com/p/cuda-convnet/wiki/CheckingGradients

Yangqing (Member)

The multinomial loss precision should be fine :)

Splitting this way should be fine too.


shelhamer added a commit that referenced this pull request Feb 27, 2014
Split source files between CUDA and CPU code. Pave the way for #3 and #122.
shelhamer merged commit 40a1548 into BVLC:dev on Feb 27, 2014
shelhamer (Member)

@erictzeng let's figure out #165 now. I'm doing a rebase, then we can check what's what.

sguada (Contributor) commented Feb 27, 2014

@shelhamer @mavenlin thanks for the pointers. I think a test should consistently pass or fail; that's part of the point of automated testing. So please adjust the epsilon if it needs to be adjusted.
It would also be nice to get a summary at the bottom that counts all the tests that passed or failed. Not sure how easy that would be, but it would be helpful.

shelhamer (Member)

Sergio, it is already adjusted (see the commit I linked), but that change just hasn't made it in yet. I'm trying to bring it in now.

Testing could be improved by fixing #173 and #174.

sguada (Contributor) commented Feb 27, 2014

Thanks Evan, I'll think soon about how to test the mat wrapper mentioned in #173.

tdomhan (Contributor) commented Feb 27, 2014

I'm wondering: has there been any work on compiling it without the CUDA parts? From quickly skimming the changes (and correct me if I'm wrong), it looks like the classes are split up, but both the CPU and the GPU code still need to be compiled. In any case, this is of course a very useful first step toward building a CPU-only version.

erictzeng (Contributor, Author)

You're right that, as of right now, the CPU and GPU code still have to be compiled together. Cleanly separating the two so that Caffe can be compiled without a GPU is one of our short-term goals, though, so consider this just step 1 in an ongoing process!
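To make that concrete, one possible shape for the eventual CPU-only switch (the CPU_ONLY flag below is hypothetical; this PR only separates the source files and does not add any such flag):

```cpp
// Hypothetical compile-time switch for a CPU-only build; not part of this PR.
// Building with -DCPU_ONLY would hide the GPU declarations, and the build
// system would skip the .cu files entirely.
#include <vector>

template <typename Dtype>
class FooLayer {
 public:
  void Forward_cpu(const std::vector<Dtype>& bottom, std::vector<Dtype>& top);
#ifndef CPU_ONLY
  // Defined in foo_layer.cu; compiled only when CUDA is available.
  void Forward_gpu(const std::vector<Dtype>& bottom, std::vector<Dtype>& top);
#endif
};
```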

sguada mentioned this pull request Mar 18, 2014
mitmul pushed a commit to mitmul/caffe that referenced this pull request Sep 30, 2014
Split source files between CUDA and CPU code. Pave the way for BVLC#3 and BVLC#122.
lopho pushed a commit to lopho/caffe that referenced this pull request Jun 29, 2016
Added new GaussianStatic dummy layer
wk910930 pushed a commit to wk910930/caffe that referenced this pull request Jun 21, 2017
fix the bug in input reshape of cuDNN